Pruning variable selection ensembles

Authors

  • Chunxia Zhang
  • Yilei Wu
  • Mu Zhu
Abstract

In the context of variable selection, ensemble learning has gained increasing interest due to its great potential to improve selection accuracy and to reduce the false discovery rate. A novel ordering-based selective ensemble learning strategy is designed in this paper to obtain smaller but more accurate ensembles. In particular, a greedy sorting strategy is proposed to rearrange the order in which the members are included in the integration process. By stopping the fusion process early, a smaller subensemble with higher selection accuracy can be obtained. More importantly, the sequential inclusion criterion reveals the fundamental strength-diversity trade-off among ensemble members. Taking stability selection (abbreviated as StabSel) as an example, experiments are conducted on both simulated and real-world data to examine the performance of the novel algorithm. Experimental results demonstrate that pruned StabSel generally achieves higher selection accuracy and lower false discovery rates than StabSel and several other benchmark methods.
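To make the ordering-and-early-stopping idea concrete, the following is a minimal Python sketch in the spirit of the method: an ensemble of Lasso fits on random half-samples (as in StabSel) is greedily reordered so that each newly added member most improves an aggregation-level criterion, and only the best prefix of the reordered ensemble is kept. The Lasso base learner, the refit-OLS training-RMSE criterion, and the 0.6 selection-frequency threshold are illustrative assumptions, not the authors' exact choices.

```python
# Minimal sketch of ordering-based pruning for a stability-selection-style
# ensemble. The base learner, the ordering criterion, and the threshold below
# are illustrative assumptions, not the paper's exact algorithm.
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(0)

# Toy data: 10 informative variables out of p = 50.
n, p, k = 200, 50, 10
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:k] = 2.0
y = X @ beta + rng.standard_normal(n)

# Build the ensemble: Lasso fits on random half-samples, as in StabSel.
B, tau = 100, 0.6                       # ensemble size, selection-frequency threshold
members = np.zeros((B, p), dtype=bool)  # members[b, j] = True if member b selects x_j
for b in range(B):
    idx = rng.choice(n, size=n // 2, replace=False)
    fit = Lasso(alpha=0.1).fit(X[idx], y[idx])
    members[b] = fit.coef_ != 0

def loss(member_subset):
    """Ordering criterion (an assumption): refit OLS on the variables whose
    aggregated selection frequency exceeds tau and return the training RMSE."""
    freq = members[member_subset].mean(axis=0)
    sel = np.where(freq >= tau)[0]
    if sel.size == 0:
        return np.inf
    pred = LinearRegression().fit(X[:, sel], y).predict(X[:, sel])
    return np.sqrt(np.mean((y - pred) ** 2))

# Greedy reordering: at each step add the member that most improves the criterion.
remaining, order, path = list(range(B)), [], []
while remaining:
    best = min(remaining, key=lambda b: loss(order + [b]))
    order.append(best)
    remaining.remove(best)
    path.append(loss(order))

# Prune: keep only the prefix of the ordering with the lowest criterion value.
best_size = int(np.argmin(path)) + 1
pruned_freq = members[order[:best_size]].mean(axis=0)
selected = np.where(pruned_freq >= tau)[0]
print(f"kept {best_size}/{B} members, selected variables: {selected}")
```

In this sketch the greedy criterion plays the role of the paper's sequential inclusion criterion: a member enters early only if it is both individually strong and complementary to those already included, which is where the strength-diversity trade-off becomes visible.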

Similar articles

Dynamic ensemble selection and instantaneous pruning for regression

A novel dynamic method of selecting pruned ensembles of predictors for regression problems is presented. The proposed method, known henceforth as DESIP, enhances the prediction accuracy and generalization ability of pruning methods. Pruning heuristics attempt to combine accurate yet complementary members; DESIP therefore enhances performance by modifying the pruned aggregation through distr...

An Empirical Investigation on the Use of Diversity for Creation of Classifier Ensembles

We address one of the main open issues about the use of diversity in multiple classifier systems: the effectiveness of the explicit use of diversity measures for creation of classifier ensembles. So far, diversity measures have been mostly used for ensemble pruning, namely, for selecting a subset of classifiers out of an original, larger ensemble. Here we focus on pruning techniques based on fo...

Tree Pruning for Output Coded Ensembles

Output Coding is a method of converting a multiclass problem into several binary subproblems, yielding an ensemble of binary classifiers. Like other ensemble methods, its performance depends on the accuracy and diversity of the base classifiers. If a decision tree is chosen as the base classifier, the issue of tree pruning needs to be addressed. In this paper we investigate the effect of six methods of...

Ensemble Pruning Via Semi-definite Programming

An ensemble is a group of learning models that jointly solve a problem. However, the ensembles generated by existing techniques are sometimes unnecessarily large, which can lead to extra memory usage, computational costs, and occasional decreases in effectiveness. The purpose of ensemble pruning is to search for a good subset of ensemble members that performs as well as, or better than, the ori...

Ensemble Learning and Pruning in Multi-Objective Genetic Programming for Classification with Unbalanced Data

Machine learning algorithms can suffer a performance bias when data sets are unbalanced. This paper develops a multi-objective genetic programming approach to evolving accurate and diverse ensembles of non-dominated solutions where members vote on class membership. We explore, using a range of unbalanced data sets, why the ensembles can also be vulnerable to this learning bias. Based on the notion...

Journal:
  • CoRR

Volume: abs/1704.08265
Issue: -
Pages: -
Publication date: 2017